Three Visual Cluster Validity Methods for Object and Relational Data

Author

  • James C. Bezdek
Abstract

This talk presents three visual cluster validity methods, developed by the authors, that can be used for both object and relational data sets. The original method, VAT (visual assessment of clustering tendency), works well for dissimilarity data with up to about n = 5000 objects, but it quickly runs into storage and resolution limits and is of limited utility for large data sets. The second method in the family is sVAT (scalable VAT). This algorithm draws a sample from the rows (or columns) of very large, square, relational data and builds a VAT image on the sample. sVAT is shown to be exact when the data contain compact, separated clusters in the sense of Dunn, and it can be used with arbitrarily large data sets. The third method, coVAT, can be used to estimate the number of clusters in objects represented by the rows and columns of a rectangular dissimilarity matrix, as well as the number of mixed and unmixed clusters in the union of the row and column objects. We show examples of using coVAT to detect the number of clusters in the four canonical clustering problems that are present in rectangular relational data.

About the speaker: Jim Bezdek received the BSCE from the University of Nevada, Reno, in 1969 and the Ph.D. in Applied Math from Cornell in 1973. He is currently the Nystul Professor of Computer Science at the University of West Florida. His previous experience includes the directorship of Boeing's HTC Inf. Proc. Lab and a term as head of Computer Science at the University of South Carolina. Jim's interests include woodworking, optimization, motorcycles, pattern recognition, fishing, vision and image processing, skiing, computational neural networks, blues music, medical applications, and cigars. Jim is the founding editor of the International Journal of Approximate Reasoning and the IEEE Transactions on Fuzzy Systems. He has been a distinguished lecturer for the IEEE and ACM, is an IEEE Fellow, and was president of the IEEE Neural Networks Council in 1997-1998. Current research topics: multiple prototype classifier designs, mixed fuzzy-possibilistic c-means clustering models, rule extraction with clustering, generalized nearest prototype classifier networks, fusion of heterogeneous fuzzy data, target recognition with LADAR data, mammographic image analysis, topics in cluster validity, robotic control models, fuzzy learning vector quantization, clustering with genetic algorithms, and acceleration of image processing algorithms.

EUSFLAT LFA 2005
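
The abstract does not spell out how VAT builds its image. As a rough illustration only, the sketch below follows the commonly described VAT reordering: start from an object involved in the largest dissimilarity, then repeatedly append the unordered object closest to the already ordered set (a Prim-like pass), and display the reordered matrix. The function names `vat_order` and `vat_image` are hypothetical, and this is not claimed to be the authors' implementation.

```python
import numpy as np

def vat_order(D):
    """Return a VAT-style ordering of the objects in a square dissimilarity matrix D."""
    D = np.asarray(D, dtype=float)
    if D.ndim != 2 or D.shape[0] != D.shape[1]:
        raise ValueError("D must be a square matrix")
    n = D.shape[0]
    # Start from the row that contains the largest dissimilarity.
    start = int(np.unravel_index(np.argmax(D), D.shape)[0])
    order = [start]
    remaining = np.ones(n, dtype=bool)
    remaining[start] = False
    # nearest[q] = smallest dissimilarity from any ordered object to q
    nearest = D[start].copy()
    for _ in range(n - 1):
        candidates = np.where(remaining)[0]
        j = int(candidates[np.argmin(nearest[candidates])])
        order.append(j)
        remaining[j] = False
        nearest = np.minimum(nearest, D[j])
    return np.array(order)

def vat_image(D):
    """Return the reordered dissimilarity matrix and the ordering used."""
    D = np.asarray(D, dtype=float)
    order = vat_order(D)
    return D[np.ix_(order, order)], order

# Toy data (an assumption for illustration): two groups of points on a line.
points = np.array([0.0, 10.0, 0.1, 10.1, 0.2])
D = np.abs(points[:, None] - points[None, :])
reordered, order = vat_image(D)
```

Rendered as a grayscale image, `reordered` would show dark square blocks along the diagonal where the data contain well-separated groups; the number of blocks suggests the number of clusters. The full n-by-n matrix must be held in memory, which is consistent with the storage limit the abstract mentions for large n.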

Similar resources

Clustering in Relational Data and Ontologies

This dissertation studies the problem of clustering objects represented by relational data. This is a pertinent problem as many real-world data sets can only be represented by relational data for which object-based clustering algorithms are not designed. Relational data are encountered in many fields including biology, management, industrial engineering, and social sciences. Unlike numerical ob...

Comparison of Object Relations, Personality Organization, and Personal and Relational Meaning of Life in Psychology Graduates vs. other Students in Lahijan Azad University

Aim: We conducted the present study to compare object relations, personality organization, personal meaning of life, and relational meaning in life among students of Islamic Azad University, Lahijan branch. Method: The research design was ex post facto (causal-comparative). The sample included 200 students (100 psychology students and 100 students from other majors) selected based on convenience sampling...

Support Tools for Visual Information Management

Visual applications need to represent, manipulate, store, and retrieve both raw and processed visual data. Existing relational and object-oriented database systems fail to offer satisfactory visual data management support because they lack the kinds of representations, storage structures, indices, access methods, and query mechanisms needed for visual data. We argue that extensible visual objec...

Kernelized Non-Euclidean Relational c-Means Algorithms

Successes with kernel-based classification methods have spawned many recent efforts to kernelize clustering algorithms for object data. We extend this idea to the relational data case by proposing kernelized forms of the non-Euclidean relational fuzzy (NERF) and hard (NERH) c-means algorithms. We show that these relational forms are dual to kernelized forms of fuzzy and hard c-means (FCM, HCM) ...

Parallel Spatial Pyramid Match Kernel Algorithm for Object Recognition using a Cluster of Computers

This paper parallelizes an implementation of the spatial pyramid match kernel (SPK). Used with a support vector machine classifier, SPK is one of the most widely used kernel methods and achieves high accuracy in object recognition. The MATLAB Parallel Computing Toolbox has been used to parallelize SPK. In this implementation, MATLAB Message Passing Interface (MPI) functions and features included in the toolbox help u...

Publication date: 2005